Overview

Dataset statistics

Number of variables: 20
Number of observations: 2061357
Missing cells: 2654282
Missing cells (%): 6.4%
Duplicate rows: 6725
Duplicate rows (%): 0.3%
Total size in memory: 314.5 MiB
Average record size in memory: 160.0 B
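The percentages and the average record size follow from the raw counts in the table. A minimal sketch of that arithmetic, assuming the missing-cell percentage is taken over all cells (variables times observations) and that 1 MiB is 1024 × 1024 bytes; the constant and function names are illustrative only:

```python
# Figures copied from the "Dataset statistics" table.
N_VARIABLES = 20
N_OBSERVATIONS = 2_061_357
MISSING_CELLS = 2_654_282
DUPLICATE_ROWS = 6_725
TOTAL_SIZE_MIB = 314.5


def percent(part: float, whole: float) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Assumption: missing cells are measured against every cell in the table.
total_cells = N_VARIABLES * N_OBSERVATIONS
missing_pct = percent(MISSING_CELLS, total_cells)
duplicate_pct = percent(DUPLICATE_ROWS, N_OBSERVATIONS)

# Assumption: 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = round(TOTAL_SIZE_MIB * 1024 * 1024 / N_OBSERVATIONS, 1)

print(missing_pct, duplicate_pct, avg_record_bytes)
```

Under these assumptions the three derived values agree with the figures the report lists.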

Variable types

Text: 5
Numeric: 6
Categorical: 5
DateTime: 4

Alerts

Dataset has 6725 (0.3%) duplicate rows [Duplicates]
arrival_delay_check is highly overall correlated with departure_delay_check [High correlation]
arrival_delay_m is highly overall correlated with departure_delay_m [High correlation]
departure_delay_check is highly overall correlated with arrival_delay_check [High correlation]
departure_delay_m is highly overall correlated with arrival_delay_m [High correlation]
eva_nr is highly overall correlated with long and 2 other fields [High correlation]
info is highly overall correlated with lat and 3 other fields [High correlation]
lat is highly overall correlated with info and 2 other fields [High correlation]
long is highly overall correlated with eva_nr and 2 other fields [High correlation]
state is highly overall correlated with eva_nr and 4 other fields [High correlation]
zip is highly overall correlated with eva_nr and 3 other fields [High correlation]
arrival_delay_check is highly imbalanced (69.8%) [Imbalance]
departure_delay_check is highly imbalanced (69.7%) [Imbalance]
path has 211355 (10.3%) missing values [Missing]
arrival_plan has 211355 (10.3%) missing values [Missing]
arrival_change has 475630 (23.1%) missing values [Missing]
departure_change has 339926 (16.5%) missing values [Missing]
info has 1416016 (68.7%) missing values [Missing]
arrival_delay_m has 1406905 (68.3%) zeros [Zeros]
departure_delay_m has 1338078 (64.9%) zeros [Zeros]
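The imbalance percentages can be reproduced from the class counts the report gives for arrival_delay_check (1950404 on_time, 110953 delay) and departure_delay_check (1950009 on_time, 111348 delay). The sketch below assumes the score is one minus the Shannon entropy of the class shares, normalized by the logarithm of the number of classes. The report does not state this definition; it is used here because it matches the two reported figures. The function name is illustrative.

```python
import math


def imbalance_score(counts):
    """One minus the normalized Shannon entropy of the class shares.

    Assumed definition: 0.0 means perfectly balanced classes, values
    near 1.0 mean a single class dominates.
    """
    if len(counts) < 2:
        return 0.0
    total = sum(counts)
    shares = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in shares)
    return 1.0 - entropy / math.log2(len(counts))


# Class counts copied from the report (on_time, delay).
arrival = imbalance_score([1950404, 110953])
departure = imbalance_score([1950009, 111348])

print(round(100 * arrival, 1), round(100 * departure, 1))
```

A roughly 95/5 split therefore scores close to 70% rather than 95%, because the score measures lost entropy, not the majority share.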

Reproduction

Analysis started: 2024-11-16 19:02:59.691487
Analysis finished: 2024-11-16 19:04:32.948068
Duration: 1 minute and 33.26 seconds
Software version: ydata-profiling v4.11.0
Download configuration: config.json

Variables

ID
Text

Distinct: 2029894
Distinct (%): 98.5%
Missing: 0
Missing (%): 0.0%
Memory size: 15.7 MiB

Length

Max length: 34
Median length: 33
Mean length: 32.847716
Min length: 28

Characters and Unicode

Total characters: 67710870
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1998431
Unique (%): 96.9%

Sample

1st row: 1573967790757085557-2407072312-14
2nd row: 349781417030375472-2407080017-1
3rd row: 7157250219775883918-2407072120-25
4th row: 349781417030375472-2407080017-2
5th row: 1983158592123451570-2407080010-3

Words

Value, Count, Frequency (%)
6858979647319381812-2407090618-14, 2, < 0.1%
5820508505174735710-2407121206-10, 2, < 0.1%
5798039258756669712-2407040428-9, 2, < 0.1%
8529221368551661117-2407100726-11, 2, < 0.1%
2977632636429677844-2407040423-14, 2, < 0.1%
3460592075112938158-2407141643-10, 2, < 0.1%
3544810355449558240-2407040433-3, 2, < 0.1%
4747130191045816853-2407081511-20, 2, < 0.1%
3476154881632071886-2407081524-14, 2, < 0.1%
6289938859164500858-2407040440-4, 2, < 0.1%
Other values (2029884), 2061337, > 99.9%

Most occurring characters

Value, Count, Frequency (%)
1, 8495442, 12.5%
0, 8383080, 12.4%
2, 7800385, 11.5%
4, 7210622, 10.6%
7, 6524425, 9.6%
3, 5332792, 7.9%
-, 5158132, 7.6%
8, 4909314, 7.3%
5, 4840335, 7.1%
6, 4530621, 6.7%

Most occurring categories

Value, Count, Frequency (%)
(unknown), 67710870, 100.0%

Most occurring scripts

Value, Count, Frequency (%)
(unknown), 67710870, 100.0%

Most occurring blocks

Value, Count, Frequency (%)
(unknown), 67710870, 100.0%

line
Text

Distinct: 296
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.7 MiB

Length

Max length: 5
Median length: 1
Mean length: 1.607087
Min length: 1

Characters and Unicode

Total characters: 3312780
Distinct characters: 33
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20
2nd row: 18
3rd row: 1
4th row: 18
5th row: 33

Words

Value, Count, Frequency (%)
1, 327551, 15.9%
2, 170917, 8.3%
3, 167354, 8.1%
6, 117441, 5.7%
5, 102976, 5.0%
8, 98876, 4.8%
7, 75570, 3.7%
4, 71711, 3.5%
9, 65965, 3.2%
42, 42461, 2.1%
Other values (284), 820552, 39.8%

Most occurring characters

Value, Count, Frequency (%)
1, 643478, 19.4%
2, 401434, 12.1%
3, 316557, 9.6%
4, 283984, 8.6%
5, 272514, 8.2%
6, 263947, 8.0%
8, 206967, 6.2%
7, 195939, 5.9%
R, 194774, 5.9%
9, 139356, 4.2%
Other values (23), 393830, 11.9%

Most occurring categories

Value, Count, Frequency (%)
(unknown), 3312780, 100.0%

Most occurring scripts

Value, Count, Frequency (%)
(unknown), 3312780, 100.0%

Most occurring blocks

Value, Count, Frequency (%)
(unknown), 3312780, 100.0%

path
Text

Missing 

Distinct: 22153
Distinct (%): 1.2%
Missing: 211355
Missing (%): 10.3%
Memory size: 15.7 MiB

Length

Max length: 1229
Median length: 626
Mean length: 181.48341
Min length: 4

Characters and Unicode

Total characters: 335744675
Distinct characters: 77
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1118
Unique (%): 0.1%

Sample

1st row: Stolberg(Rheinl)Hbf Gl.44|Eschweiler-St.Jöris|Alsdorf Poststraße|Alsdorf-Mariadorf|Alsdorf-Kellersberg|Alsdorf-Annapark|Alsdorf-Busch|Herzogenrath-August-Schmidt-Platz|Herzogenrath-Alt-Merkstein|Herzogenrath|Kohlscheid|Aachen West|Aachen Schanz
2nd row: Hamm(Westf)Hbf|Kamen|Kamen-Methler|Dortmund-Kurl|Dortmund-Scharnhorst|Dortmund Hbf|Bochum Hbf|Wattenscheid|Essen Hbf|Mülheim(Ruhr)Hbf|Duisburg Hbf|Düsseldorf Flughafen|Düsseldorf Hbf|Düsseldorf-Benrath|Leverkusen Mitte|Köln-Mülheim|Köln Messe/Deutz|Köln Hbf|Köln-Ehrenfeld|Horrem|Düren|Langerwehe|Eschweiler Hbf|Stolberg(Rheinl)Hbf
3rd row: Aachen Hbf
4th row: Herzogenrath|Kohlscheid
5th row: Herzogenrath

Words

Value, Count, Frequency (%)
hbf, 456947, 3.4%
s)|berlin, 450434, 3.4%
allee|berlin, 163575, 1.2%
berlin, 111434, 0.8%
friedrichstraße, 96045, 0.7%
straße|berlin, 90394, 0.7%
am, 89539, 0.7%
flughafen, 86429, 0.7%
ostkreuz, 85308, 0.6%
rosenheimer, 81857, 0.6%
Other values (13360), 11560324, 87.1%

Most occurring characters

Value, Count, Frequency (%)
e, 36026888, 10.7%
r, 26440854, 7.9%
n, 25409619, 7.6%
a, 17843776, 5.3%
|, 17652750, 5.3%
i, 15166799, 4.5%
l, 15129247, 4.5%
t, 14062373, 4.2%
s, 12328613, 3.7%
h, 11798194, 3.5%
Other values (67), 143885562, 42.9%

Most occurring categories

Value, Count, Frequency (%)
(unknown), 335744675, 100.0%

Most occurring scripts

Value, Count, Frequency (%)
(unknown), 335744675, 100.0%

Most occurring blocks

Value, Count, Frequency (%)
(unknown), 335744675, 100.0%
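
The word counts for path contain tokens such as "s)|berlin" and "allee|berlin", which suggests the values were lowercased and split on whitespace only, so the "|" separator between station names stayed inside the tokens. The sketch below contrasts that assumed tokenization with a split on "|". Both helper names are hypothetical, and the strings are taken from the sample rows of path.

```python
from collections import Counter


def whitespace_word_counts(values):
    """Count lowercase tokens split on whitespace (assumed profiler behaviour)."""
    counts = Counter()
    for value in values:
        counts.update(value.lower().split())
    return counts


def stops(path):
    """Split a path value into its station names on the '|' separator."""
    return path.split("|")


# An excerpt of the 2nd sample row and the full 4th sample row of `path`.
fragment = "Düsseldorf Flughafen|Düsseldorf Hbf"
short_path = "Herzogenrath|Kohlscheid"

tokens = whitespace_word_counts([fragment])
print(sorted(tokens))
print(stops(fragment))
print(len(stops(short_path)))
```

Splitting on "|" first would yield per-station counts, which is likely the more useful view for this column.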

eva_nr
Real number (ℝ)

High correlation 

Distinct: 1996
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8018266.8
Minimum: 8000001
Maximum: 8098360
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.7 MiB

Quantile statistics

Minimum: 8000001
5-th percentile: 8000105
Q1: 8001582
median: 8004136
Q3: 8010208
95-th percentile: 8089080
Maximum: 8098360
Range: 98359
Interquartile range (IQR): 8626

Descriptive statistics

Standard deviation: 31786.593
Coefficient of variation (CV): 0.0039642724
Kurtosis: 1.0111613
Mean: 8018266.8
Median Absolute Deviation (MAD): 3054
Skewness: 1.7122051
Sum: 1.652851 × 10^13
Variance: 1.0103875 × 10^9
Monotonicity: Not monotonic
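
Several of the eva_nr statistics are derived from others in the same list. A short sketch of those relationships using the values printed for eva_nr. The inputs are already rounded in the report, so the results match only to the displayed precision.

```python
# Values copied from the eva_nr statistics.
minimum, maximum = 8000001, 8098360
q1, q3 = 8001582, 8010208
mean, std = 8018266.8, 31786.593

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance is the squared standard deviation

print(value_range, iqr, round(cv, 10), f"{variance:.7e}")
```

The same identities hold for the other numeric variables in the report (zip, long, lat, and the two delay columns).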
Histogram with fixed size bins (bins=50)

Common values

Value, Count, Frequency (%)
8004128, 8732, 0.4%
8089047, 8312, 0.4%
8000262, 7814, 0.4%
8004132, 7598, 0.4%
8004131, 7382, 0.4%
8004135, 7378, 0.4%
8004129, 7366, 0.4%
8004136, 7324, 0.4%
8089045, 7124, 0.3%
8003368, 6828, 0.3%
Other values (1986), 1985499, 96.3%

Minimum 10 values

Value, Count, Frequency (%)
8000001, 1488, 0.1%
8000002, 823, < 0.1%
8000004, 848, < 0.1%
8000007, 591, < 0.1%
8000009, 829, < 0.1%
8000010, 946, < 0.1%
8000011, 589, < 0.1%
8000012, 896, < 0.1%
8000013, 2337, 0.1%
8000014, 756, < 0.1%

Maximum 10 values

Value, Count, Frequency (%)
8098360, 534, < 0.1%
8089537, 2191, 0.1%
8089474, 5831, 0.3%
8089473, 1536, 0.1%
8089472, 1544, 0.1%
8089331, 1690, 0.1%
8089330, 1923, 0.1%
8089329, 1811, 0.1%
8089328, 1941, 0.1%
8089327, 2781, 0.1%

category
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.7 MiB
4: 788039
5: 643822
3: 421535
2: 137222
1: 70739

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2061357
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 2
3rd row: 4
4th row: 5
5th row: 5

Common Values

Value, Count, Frequency (%)
4, 788039, 38.2%
5, 643822, 31.2%
3, 421535, 20.4%
2, 137222, 6.7%
1, 70739, 3.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value, Count, Frequency (%)
4, 788039, 38.2%
5, 643822, 31.2%
3, 421535, 20.4%
2, 137222, 6.7%
1, 70739, 3.4%

Most occurring characters

Value, Count, Frequency (%)
4, 788039, 38.2%
5, 643822, 31.2%
3, 421535, 20.4%
2, 137222, 6.7%
1, 70739, 3.4%

Most occurring categories

Value, Count, Frequency (%)
(unknown), 2061357, 100.0%

Most occurring scripts

Value, Count, Frequency (%)
(unknown), 2061357, 100.0%

Most occurring blocks

Value, Count, Frequency (%)
(unknown), 2061357, 100.0%

Distinct: 1996
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.7 MiB

Length

Max length: 42
Median length: 30
Mean length: 14.651004
Min length: 4

Characters and Unicode

Total characters: 30200949
Distinct characters: 63
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Aachen Hbf
2nd row: Aachen Hbf
3rd row: Aachen-Rothe Erde
4th row: Aachen West
5th row: Aachen West

Words

Value, Count, Frequency (%)
hbf, 187162, 5.9%
münchen, 63047, 2.0%
main, 62447, 2.0%
frankfurt, 54578, 1.7%
straße, 39439, 1.2%
berlin, 34076, 1.1%
stuttgart, 27326, 0.9%
bad, 27081, 0.8%
köln, 25774, 0.8%
ost, 25336, 0.8%
Other values (2079), 2644502, 82.9%

Most occurring characters

Value, Count, Frequency (%)
e, 3529648, 11.7%
r, 2333415, 7.7%
n, 2330307, 7.7%
a, 1806903, 6.0%
t, 1459870, 4.8%
i, 1324360, 4.4%
l, 1291474, 4.3%
s, 1288466, 4.3%
h, 1201107, 4.0%
(space), 1129411, 3.7%
Other values (53), 12505988, 41.4%

Most occurring categories

Value, Count, Frequency (%)
(unknown), 30200949, 100.0%

Most occurring scripts

Value, Count, Frequency (%)
(unknown), 30200949, 100.0%

Most occurring blocks

Value, Count, Frequency (%)
(unknown), 30200949, 100.0%

state
Categorical

High correlation 

Distinct: 16
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.7 MiB
Nordrhein-Westfalen: 342957
Berlin: 334845
Bayern: 330381
Baden-Württemberg: 253224
Hessen: 200308
Other values (11): 599642

Length

Max length: 22
Median length: 19
Mean length: 10.957078
Min length: 6

Characters and Unicode

Total characters: 22586450
Distinct characters: 35
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Nordrhein-Westfalen
2nd row: Nordrhein-Westfalen
3rd row: Nordrhein-Westfalen
4th row: Nordrhein-Westfalen
5th row: Nordrhein-Westfalen

Common Values

Value, Count, Frequency (%)
Nordrhein-Westfalen, 342957, 16.6%
Berlin, 334845, 16.2%
Bayern, 330381, 16.0%
Baden-Württemberg, 253224, 12.3%
Hessen, 200308, 9.7%
Hamburg, 154982, 7.5%
Sachsen, 84791, 4.1%
Niedersachsen, 82767, 4.0%
Rheinland-Pfalz, 78941, 3.8%
Brandenburg, 58961, 2.9%
Other values (6), 139200, 6.8%

Length

Histogram of lengths of the category

Words

Value, Count, Frequency (%)
nordrhein-westfalen, 342957, 16.6%
berlin, 334845, 16.2%
bayern, 330381, 16.0%
baden-württemberg, 253224, 12.3%
hessen, 200308, 9.7%
hamburg, 154982, 7.5%
sachsen, 84791, 4.1%
niedersachsen, 82767, 4.0%
rheinland-pfalz, 78941, 3.8%
brandenburg, 58961, 2.9%
Other values (6), 139200, 6.8%

Most occurring characters

Value, Count, Frequency (%)
e, 3540070, 15.7%
n, 2462639, 10.9%
r, 2331538, 10.3%
a, 1579992, 7.0%
s, 1097165, 4.9%
B, 987839, 4.4%
l, 979365, 4.3%
i, 932853, 4.1%
t, 916588, 4.1%
d, 834133, 3.7%
Other values (25), 6924268, 30.7%

Most occurring categories

Value, Count, Frequency (%)
(unknown), 22586450, 100.0%

Most occurring scripts

Value, Count, Frequency (%)
(unknown), 22586450, 100.0%

Most occurring blocks

Value, Count, Frequency (%)
(unknown), 22586450, 100.0%

city
Text

Distinct: 1292
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.7 MiB

Length

Max length: 25
Median length: 23
Mean length: 8.9921067
Min length: 3

Characters and Unicode

Total characters: 18535942
Distinct characters: 60
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Aachen
2nd row: Aachen
3rd row: Aachen
4th row: Aachen
5th row: Aachen

Words

Value, Count, Frequency (%)
berlin, 336007, 13.8%
hamburg, 154982, 6.4%
münchen, 118076, 4.9%
main, 87750, 3.6%
am, 82347, 3.4%
frankfurt, 69216, 2.9%
köln, 42898, 1.8%
stuttgart, 41626, 1.7%
düsseldorf, 38329, 1.6%
bad, 28402, 1.2%
Other values (1345), 1427171, 58.8%

Most occurring characters

Value, Count, Frequency (%)
e, 2104898, 11.4%
n, 1844843, 10.0%
r, 1622878, 8.8%
a, 1151914, 6.2%
i, 1088569, 5.9%
l, 884751, 4.8%
t, 713605, 3.8%
u, 688424, 3.7%
h, 639238, 3.4%
g, 636523, 3.4%
Other values (50), 7160299, 38.6%

Most occurring categories

Value, Count, Frequency (%)
(unknown), 18535942, 100.0%

Most occurring scripts

Value, Count, Frequency (%)
(unknown), 18535942, 100.0%

Most occurring blocks

Value, Count, Frequency (%)
(unknown), 18535942, 100.0%

zip
Real number (ℝ)

High correlation 

Distinct: 1651
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 46279.239
Minimum: 1067
Maximum: 99974
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.7 MiB

Quantile statistics

Minimum: 1067
5-th percentile: 7745
Q1: 18109
median: 47051
Q3: 70806
95-th percentile: 88427
Maximum: 99974
Range: 98907
Interquartile range (IQR): 52697

Descriptive statistics

Standard deviation: 28214.243
Coefficient of variation (CV): 0.60965226
Kurtosis: -1.3681252
Mean: 46279.239
Median Absolute Deviation (MAD): 26211
Skewness: 0.045692967
Sum: 9.5398033 × 10^10
Variance: 7.9604349 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value, Count, Frequency (%)
80331, 22358, 1.1%
80639, 13233, 0.6%
10557, 12991, 0.6%
14057, 11976, 0.6%
10117, 11693, 0.6%
10827, 11687, 0.6%
60313, 11368, 0.6%
22525, 9877, 0.5%
10317, 9655, 0.5%
20354, 9641, 0.5%
Other values (1641), 1936878, 94.0%

Minimum 10 values

Value, Count, Frequency (%)
1067, 2458, 0.1%
1069, 2045, 0.1%
1097, 3305, 0.2%
1109, 1800, 0.1%
1127, 597, < 0.1%
1129, 1882, 0.1%
1159, 988, < 0.1%
1187, 566, < 0.1%
1219, 917, < 0.1%
1237, 1944, 0.1%

Maximum 10 values

Value, Count, Frequency (%)
99974, 421, < 0.1%
99947, 453, < 0.1%
99880, 424, < 0.1%
99867, 494, < 0.1%
99817, 453, < 0.1%
99752, 252, < 0.1%
99734, 497, < 0.1%
99610, 354, < 0.1%
99518, 279, < 0.1%
99510, 360, < 0.1%
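
The report treats zip as a real number, and its smallest values (1067, 1069, ...) have only four digits. German postal codes are five digits long, so these values appear to have lost a leading zero when the column was read as numeric. A minimal sketch of restoring the five-digit form, assuming the values are whole numbers; `format_zip` is a hypothetical helper, not part of the profiling tool:

```python
def format_zip(value) -> str:
    """Render a numeric postal code as a zero-padded five-character string."""
    return f"{int(value):05d}"


# The smallest and largest zip values listed in the report.
print(format_zip(1067))
print(format_zip(99974.0))
```

Storing the column as text would also keep mean, variance, and correlation figures, which have no meaning for postal codes, out of the profile.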

long
Real number (ℝ)

High correlation 

Distinct: 1995
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.183673
Minimum: 6.070715
Maximum: 14.97908
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.7 MiB

Quantile statistics

Minimum: 6.070715
5-th percentile: 6.815137
Q1: 8.494709
median: 9.944088
Q3: 12.090548
95-th percentile: 13.513799
Maximum: 14.97908
Range: 8.908365
Interquartile range (IQR): 3.595839

Descriptive statistics

Standard deviation: 2.2735246
Coefficient of variation (CV): 0.22325193
Kurtosis: -1.2261664
Mean: 10.183673
Median Absolute Deviation (MAD): 1.694189
Skewness: 0.11311638
Sum: 20992186
Variance: 5.1689143
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value, Count, Frequency (%)
11.536537, 8732, 0.4%
13.283966, 8312, 0.4%
11.604971, 7814, 0.4%
11.565619, 7598, 0.4%
11.583234, 7382, 0.4%
11.575386, 7378, 0.4%
11.548572, 7366, 0.4%
11.593049, 7324, 0.4%
13.451646, 7124, 0.3%
6.975001, 6828, 0.3%
Other values (1985), 1985499, 96.3%

Minimum 10 values

Value, Count, Frequency (%)
6.070715, 1744, 0.1%
6.07384, 1213, 0.1%
6.074485, 1051, 0.1%
6.091499, 1488, 0.1%
6.094486, 1899, 0.1%
6.097265, 820, < 0.1%
6.116475, 949, < 0.1%
6.124518, 818, < 0.1%
6.203225, 252, < 0.1%
6.207467, 717, < 0.1%

Maximum 10 values

Value, Count, Frequency (%)
14.97908, 608, < 0.1%
14.902088, 272, < 0.1%
14.805774, 578, < 0.1%
14.706775, 348, < 0.1%
14.671941, 461, < 0.1%
14.658435, 480, < 0.1%
14.648866, 266, < 0.1%
14.638027, 267, < 0.1%
14.578802, 280, < 0.1%
14.546496, 716, < 0.1%

lat
Real number (ℝ)

High correlation 

Distinct: 1996
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 50.8824
Minimum: 47.411032
Maximum: 54.906839
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.7 MiB

Quantile statistics

Minimum: 47.411032
5-th percentile: 48.111413
Q1: 49.353291
median: 51.087456
Q3: 52.478542
95-th percentile: 53.564711
Maximum: 54.906839
Range: 7.495807
Interquartile range (IQR): 3.125251

Descriptive statistics

Standard deviation: 1.7922171
Coefficient of variation (CV): 0.035222731
Kurtosis: -1.1318607
Mean: 50.8824
Median Absolute Deviation (MAD): 1.42371
Skewness: -0.11825504
Sum: 1.0488679 × 10^8
Variance: 3.2120421
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value, Count, Frequency (%)
48.142623, 8732, 0.4%
52.500737, 8312, 0.4%
48.12744, 7814, 0.4%
48.139452, 7598, 0.4%
48.134202, 7382, 0.4%
48.137048, 7378, 0.4%
48.141969, 7366, 0.4%
48.129168, 7324, 0.4%
52.505976, 7124, 0.3%
50.940874, 6828, 0.3%
Other values (1986), 1985499, 96.3%

Minimum 10 values

Value, Count, Frequency (%)
47.411032, 221, < 0.1%
47.44003, 237, < 0.1%
47.456591, 449, < 0.1%
47.491452, 419, < 0.1%
47.513241, 474, < 0.1%
47.544341, 565, < 0.1%
47.5509, 368, < 0.1%
47.552384, 874, < 0.1%
47.555857, 484, < 0.1%
47.556923, 610, < 0.1%

Maximum 10 values

Value, Count, Frequency (%)
54.906839, 234, < 0.1%
54.888814, 364, < 0.1%
54.872142, 371, < 0.1%
54.861997, 381, < 0.1%
54.789605, 565, < 0.1%
54.774039, 281, < 0.1%
54.685934, 373, < 0.1%
54.621166, 311, < 0.1%
54.499457, 517, < 0.1%
54.4720826, 537, < 0.1%

arrival_plan
Date

Missing 

Distinct: 10084
Distinct (%): 0.5%
Missing: 211355
Missing (%): 10.3%
Memory size: 15.7 MiB
Minimum: 2024-07-07 23:37:00
Maximum: 2024-07-14 23:59:00
Histogram with fixed size bins (bins=50)

Distinct: 10089
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 15.7 MiB
Minimum: 2024-07-08 00:00:00
Maximum: 2024-07-15 00:10:00
Histogram with fixed size bins (bins=50)

arrival_change
Date

Missing 

Distinct: 10122
Distinct (%): 0.6%
Missing: 475630
Missing (%): 23.1%
Memory size: 15.7 MiB
Minimum: 2024-07-07 23:39:00
Maximum: 2024-07-15 01:03:00
Histogram with fixed size bins (bins=50)

departure_change
Date

Missing 

Distinct: 10118
Distinct (%): 0.6%
Missing: 339926
Missing (%): 16.5%
Memory size: 15.7 MiB
Minimum: 2024-07-08 00:00:00
Maximum: 2024-07-15 01:04:00
Histogram with fixed size bins (bins=50)

arrival_delay_m
Real number (ℝ)

High correlation  Zeros 

Distinct: 116
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.1765808
Minimum: 0
Maximum: 159
Zeros: 1406905
Zeros (%): 68.3%
Negative: 0
Negative (%): 0.0%
Memory size: 15.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 1
95-th percentile: 6
Maximum: 159
Range: 159
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 3.4078587
Coefficient of variation (CV): 2.8964086
Kurtosis: 107.13566
Mean: 1.1765808
Median Absolute Deviation (MAD): 0
Skewness: 7.6803854
Sum: 2425353
Variance: 11.613501
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value, Count, Frequency (%)
0, 1406905, 68.3%
1, 255045, 12.4%
2, 130459, 6.3%
3, 80446, 3.9%
4, 46109, 2.2%
5, 31440, 1.5%
6, 21950, 1.1%
7, 15765, 0.8%
8, 12246, 0.6%
9, 9856, 0.5%
Other values (106), 51136, 2.5%

Minimum 10 values

Value, Count, Frequency (%)
0, 1406905, 68.3%
1, 255045, 12.4%
2, 130459, 6.3%
3, 80446, 3.9%
4, 46109, 2.2%
5, 31440, 1.5%
6, 21950, 1.1%
7, 15765, 0.8%
8, 12246, 0.6%
9, 9856, 0.5%

Maximum 10 values

Value, Count, Frequency (%)
159, 1, < 0.1%
157, 2, < 0.1%
140, 1, < 0.1%
136, 1, < 0.1%
134, 1, < 0.1%
133, 3, < 0.1%
132, 1, < 0.1%
120, 1, < 0.1%
117, 1, < 0.1%
116, 3, < 0.1%

departure_delay_m
Real number (ℝ)

High correlation  Zeros 

Distinct: 121
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.2235736
Minimum: 0
Maximum: 159
Zeros: 1338078
Zeros (%): 64.9%
Negative: 0
Negative (%): 0.0%
Memory size: 15.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 1
95-th percentile: 6
Maximum: 159
Range: 159
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 3.4183003
Coefficient of variation (CV): 2.7937023
Kurtosis: 107.25174
Mean: 1.2235736
Median Absolute Deviation (MAD): 0
Skewness: 7.6751063
Sum: 2522222
Variance: 11.684777
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value, Count, Frequency (%)
0, 1338078, 64.9%
1, 306206, 14.9%
2, 146365, 7.1%
3, 81512, 4.0%
4, 46542, 2.3%
5, 31306, 1.5%
6, 21750, 1.1%
7, 15715, 0.8%
8, 12283, 0.6%
9, 9775, 0.5%
Other values (111), 51825, 2.5%

Minimum 10 values

Value, Count, Frequency (%)
0, 1338078, 64.9%
1, 306206, 14.9%
2, 146365, 7.1%
3, 81512, 4.0%
4, 46542, 2.3%
5, 31306, 1.5%
6, 21750, 1.1%
7, 15715, 0.8%
8, 12283, 0.6%
9, 9775, 0.5%

Maximum 10 values

Value, Count, Frequency (%)
159, 1, < 0.1%
157, 1, < 0.1%
156, 1, < 0.1%
137, 1, < 0.1%
135, 1, < 0.1%
134, 2, < 0.1%
133, 1, < 0.1%
132, 2, < 0.1%
131, 1, < 0.1%
120, 1, < 0.1%

info
Categorical

High correlation  Missing 

Distinct: 7
Distinct (%): < 0.1%
Missing: 1416016
Missing (%): 68.7%
Memory size: 15.7 MiB
Information: 244033
Störung: 116325
Bauarbeiten: 96301
Information. (Quelle: zuginfo.nrw): 78977
Bauarbeiten. (Quelle: zuginfo.nrw): 72555
Other values (2): 37150

Length

Max length: 34
Median length: 11
Mean length: 16.525872
Min length: 7

Characters and Unicode

Total characters: 10664823
Distinct characters: 28
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Bauarbeiten. (Quelle: zuginfo.nrw)
2nd row: Information
3rd row: Information
4th row: Information
5th row: Information

Common Values

Value, Count, Frequency (%)
Information, 244033, 11.8%
Störung, 116325, 5.6%
Bauarbeiten, 96301, 4.7%
Information. (Quelle: zuginfo.nrw), 78977, 3.8%
Bauarbeiten. (Quelle: zuginfo.nrw), 72555, 3.5%
Störung. (Quelle: zuginfo.nrw), 28744, 1.4%
Großstörung, 8406, 0.4%
(Missing), 1416016, 68.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value, Count, Frequency (%)
information, 323010, 32.1%
quelle, 180276, 17.9%
zuginfo.nrw, 180276, 17.9%
bauarbeiten, 168856, 16.8%
störung, 145069, 14.4%
großstörung, 8406, 0.8%

Most occurring characters

Value, Count, Frequency (%)
n, 1328903, 12.5%
o, 834702, 7.8%
r, 834023, 7.8%
e, 698264, 6.5%
u, 682883, 6.4%
i, 672142, 6.3%
a, 660722, 6.2%
t, 645341, 6.1%
f, 503286, 4.7%
l, 360552, 3.4%
Other values (18), 3444005, 32.3%

Most occurring categories

Value, Count, Frequency (%)
(unknown), 10664823, 100.0%

Most occurring scripts

Value, Count, Frequency (%)
(unknown), 10664823, 100.0%

Most occurring blocks

Value, Count, Frequency (%)
(unknown), 10664823, 100.0%

arrival_delay_check
Categorical

High correlation  Imbalance 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size15.7 MiB
on_time
1950404 
delay
 
110953

Length

Max length     7
Median length  7
Mean length    6.8923496
Min length     5

Characters and Unicode

Total characters     14207593
Distinct characters  11
Distinct categories  1
Distinct scripts     1
Distinct blocks      1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
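
The length and character totals above follow directly from the two value counts. The sketch below recomputes them in plain Python; the helper name `length_summary` is hypothetical and is not part of the profiling tool.

```python
# Value counts copied from the report for arrival_delay_check.
value_counts = {"on_time": 1950404, "delay": 110953}


def length_summary(counts):
    """Return (total characters, mean length, min length, max length)."""
    n_rows = sum(counts.values())
    # Each value contributes len(value) characters for every row it occurs in.
    total_chars = sum(len(value) * n for value, n in counts.items())
    lengths = [len(value) for value in counts]
    return total_chars, total_chars / n_rows, min(lengths), max(lengths)


total_chars, mean_len, min_len, max_len = length_summary(value_counts)
print(total_chars, mean_len, min_len, max_len)
```

The total is 7 characters for each `on_time` row plus 5 for each `delay` row, and the mean length is that total divided by the number of observations.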

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row  on_time
2nd row  on_time
3rd row  on_time
4th row  on_time
5th row  on_time

Common Values

Value    Count    Frequency (%)
on_time  1950404  94.6%
delay    110953   5.4%
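
The report's Alerts flag this variable as highly imbalanced (69.8%). That figure is consistent with a normalized-entropy score, one minus the Shannon entropy of the class frequencies divided by the maximum possible entropy. The sketch below is an assumption about how such a score can be computed, not a copy of the tool's code; `imbalance_score` is a hypothetical name.

```python
import math


def imbalance_score(counts):
    """1 - H / log2(k): 0 for perfectly balanced classes, 1 for a single class."""
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(n_classes)


# Counts copied from the Common Values table for arrival_delay_check.
score = imbalance_score([1950404, 110953])
print(f"{score:.1%}")
```

With a 94.6% / 5.4% split the entropy is about 0.30 bits out of a possible 1 bit, which gives a score close to 0.70.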

Length

Histogram of lengths of the category

Common Values (Plot)

Value    Count    Frequency (%)
on_time  1950404  94.6%
delay    110953   5.4%

Most occurring characters

Value  Count    Frequency (%)
e      2061357  14.5%
n      1950404  13.7%
o      1950404  13.7%
_      1950404  13.7%
t      1950404  13.7%
i      1950404  13.7%
m      1950404  13.7%
d      110953   0.8%
l      110953   0.8%
a      110953   0.8%
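
The character counts in this table can be reproduced from the two value counts alone: every character of a value occurs once per row holding that value. The sketch below uses a hypothetical helper, `character_counts`, and is only meant to show where the numbers come from.

```python
from collections import Counter

# Value counts copied from the report for arrival_delay_check.
value_counts = {"on_time": 1950404, "delay": 110953}


def character_counts(counts):
    """Weight each character of a value by the number of rows with that value."""
    chars = Counter()
    for value, n_rows in counts.items():
        for ch in value:
            chars[ch] += n_rows
    return chars


chars = character_counts(value_counts)
for ch, n in chars.most_common():
    print(ch, n)
```

The letter `e` appears in both `on_time` and `delay`, which is why its count equals the total number of observations. The table lists ten characters; the eleventh distinct character, `y`, occurs only in `delay`.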

Most occurring categories

Value      Count     Frequency (%)
(unknown)  14207593  100.0%

Most frequent character per category

(unknown): same character counts as listed under "Most occurring characters", since all 14207593 characters fall into this one category.

Most occurring scripts

Value      Count     Frequency (%)
(unknown)  14207593  100.0%

Most frequent character per script

(unknown): same character counts as listed under "Most occurring characters", since all 14207593 characters fall into this one script.

Most occurring blocks

Value      Count     Frequency (%)
(unknown)  14207593  100.0%

Most frequent character per block

(unknown): same character counts as listed under "Most occurring characters", since all 14207593 characters fall into this one block.

departure_delay_check
Categorical

High correlation  Imbalance 

Distinct      2
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   15.7 MiB

Value    Count
on_time  1950009
delay    111348

Length

Max length     7
Median length  7
Mean length    6.8919663
Min length     5

Characters and Unicode

Total characters     14206803
Distinct characters  11
Distinct categories  1
Distinct scripts     1
Distinct blocks      1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row  on_time
2nd row  on_time
3rd row  on_time
4th row  on_time
5th row  on_time

Common Values

Value    Count    Frequency (%)
on_time  1950009  94.6%
delay    111348   5.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value    Count    Frequency (%)
on_time  1950009  94.6%
delay    111348   5.4%

Most occurring characters

Value  Count    Frequency (%)
e      2061357  14.5%
n      1950009  13.7%
o      1950009  13.7%
_      1950009  13.7%
t      1950009  13.7%
i      1950009  13.7%
m      1950009  13.7%
d      111348   0.8%
l      111348   0.8%
a      111348   0.8%

Most occurring categories

Value      Count     Frequency (%)
(unknown)  14206803  100.0%

Most frequent character per category

(unknown): same character counts as listed under "Most occurring characters", since all 14206803 characters fall into this one category.

Most occurring scripts

Value      Count     Frequency (%)
(unknown)  14206803  100.0%

Most frequent character per script

(unknown): same character counts as listed under "Most occurring characters", since all 14206803 characters fall into this one script.

Most occurring blocks

Value      Count     Frequency (%)
(unknown)  14206803  100.0%

Most frequent character per block

(unknown): same character counts as listed under "Most occurring characters", since all 14206803 characters fall into this one block.

Interactions

[Figures: 36 pairwise interaction plots]

Correlations

                       arrival_delay_check  arrival_delay_m  category  departure_delay_check  departure_delay_m  eva_nr  info    lat     long    state   zip
arrival_delay_check    1.000                0.433            0.031     0.910                  0.414              0.086   0.086   0.096   0.102   0.114   0.107
arrival_delay_m        0.433                1.000            0.010     0.428                  0.823              -0.086  0.027   -0.251  -0.105  0.019   0.226
category               0.031                0.010            1.000     0.032                  0.011              0.163   0.127   0.176   0.175   0.209   0.158
departure_delay_check  0.910                0.428            0.032     1.000                  0.434              0.087   0.085   0.096   0.102   0.114   0.107
departure_delay_m      0.414                0.823            0.011     0.434                  1.000              -0.094  0.027   -0.270  -0.111  0.019   0.245
eva_nr                 0.086                -0.086           0.163     0.087                  -0.094             1.000   0.314   0.348   0.654   0.706   -0.531
info                   0.086                0.027            0.127     0.085                  0.027              0.314   1.000   0.515   0.563   0.577   0.540
lat                    0.096                -0.251           0.176     0.096                  -0.270             0.348   0.515   1.000   0.258   0.688   -0.833
long                   0.102                -0.105           0.175     0.102                  -0.111             0.654   0.563   0.258   1.000   0.650   -0.410
state                  0.114                0.019            0.209     0.114                  0.019              0.706   0.577   0.688   0.650   1.000   0.696
zip                    0.107                0.226            0.158     0.107                  0.245              -0.531  0.540   -0.833  -0.410  0.696   1.000
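
The "High correlation" alerts at the top of the report can be cross-checked against this matrix. The sketch below lists, for each variable, the other variables whose absolute coefficient reaches a cut-off. The cut-off of 0.5 is an assumption chosen because it reproduces the alert list; the function name `strong_partners` is hypothetical.

```python
NAMES = ["arrival_delay_check", "arrival_delay_m", "category",
         "departure_delay_check", "departure_delay_m", "eva_nr",
         "info", "lat", "long", "state", "zip"]

# Coefficients copied row by row from the correlation matrix in the report.
MATRIX = [
    [1.000, 0.433, 0.031, 0.910, 0.414, 0.086, 0.086, 0.096, 0.102, 0.114, 0.107],
    [0.433, 1.000, 0.010, 0.428, 0.823, -0.086, 0.027, -0.251, -0.105, 0.019, 0.226],
    [0.031, 0.010, 1.000, 0.032, 0.011, 0.163, 0.127, 0.176, 0.175, 0.209, 0.158],
    [0.910, 0.428, 0.032, 1.000, 0.434, 0.087, 0.085, 0.096, 0.102, 0.114, 0.107],
    [0.414, 0.823, 0.011, 0.434, 1.000, -0.094, 0.027, -0.270, -0.111, 0.019, 0.245],
    [0.086, -0.086, 0.163, 0.087, -0.094, 1.000, 0.314, 0.348, 0.654, 0.706, -0.531],
    [0.086, 0.027, 0.127, 0.085, 0.027, 0.314, 1.000, 0.515, 0.563, 0.577, 0.540],
    [0.096, -0.251, 0.176, 0.096, -0.270, 0.348, 0.515, 1.000, 0.258, 0.688, -0.833],
    [0.102, -0.105, 0.175, 0.102, -0.111, 0.654, 0.563, 0.258, 1.000, 0.650, -0.410],
    [0.114, 0.019, 0.209, 0.114, 0.019, 0.706, 0.577, 0.688, 0.650, 1.000, 0.696],
    [0.107, 0.226, 0.158, 0.107, 0.245, -0.531, 0.540, -0.833, -0.410, 0.696, 1.000],
]


def strong_partners(names, matrix, threshold=0.5):
    """Map each variable to the other variables with |coefficient| >= threshold."""
    result = {}
    for i, name in enumerate(names):
        partners = [other for j, other in enumerate(names)
                    if i != j and abs(matrix[i][j]) >= threshold]
        if partners:
            result[name] = partners
    return result


partners = strong_partners(NAMES, MATRIX)
for name, others in partners.items():
    print(name, "->", ", ".join(others))
```

Under this assumed cut-off, `state` pairs with five other variables and `category` with none, which matches the alerts (`state` "and 4 other fields", no alert for `category`).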

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

index  ID  line  path  eva_nr  category  station  state  city  zip  long  lat  arrival_plan  departure_plan  arrival_change  departure_change  arrival_delay_m  departure_delay_m  info  arrival_delay_check  departure_delay_check
0  1573967790757085557-2407072312-14  20  Stolberg(Rheinl)Hbf Gl.44|Eschweiler-St.Jöris|Alsdorf Poststraße|Alsdorf-Mariadorf|Alsdorf-Kellersberg|Alsdorf-Annapark|Alsdorf-Busch|Herzogenrath-August-Schmidt-Platz|Herzogenrath-Alt-Merkstein|Herzogenrath|Kohlscheid|Aachen West|Aachen Schanz  8000001  2  Aachen Hbf  Nordrhein-Westfalen  Aachen  52064  6.091499  50.767800  2024-07-08 00:00:00  2024-07-08 00:01:00  2024-07-08 00:03:00  2024-07-08 00:04:00  3  3  NaN  on_time  on_time
1  349781417030375472-2407080017-1  18  NaN  8000001  2  Aachen Hbf  Nordrhein-Westfalen  Aachen  52064  6.091499  50.767800  NaN  2024-07-08 00:17:00  NaN  NaN  0  0  NaN  on_time  on_time
2  7157250219775883918-2407072120-25  1  Hamm(Westf)Hbf|Kamen|Kamen-Methler|Dortmund-Kurl|Dortmund-Scharnhorst|Dortmund Hbf|Bochum Hbf|Wattenscheid|Essen Hbf|Mülheim(Ruhr)Hbf|Duisburg Hbf|Düsseldorf Flughafen|Düsseldorf Hbf|Düsseldorf-Benrath|Leverkusen Mitte|Köln-Mülheim|Köln Messe/Deutz|Köln Hbf|Köln-Ehrenfeld|Horrem|Düren|Langerwehe|Eschweiler Hbf|Stolberg(Rheinl)Hbf  8000406  4  Aachen-Rothe Erde  Nordrhein-Westfalen  Aachen  52066  6.116475  50.770202  2024-07-08 00:03:00  2024-07-08 00:04:00  2024-07-08 00:03:00  2024-07-08 00:04:00  0  0  NaN  on_time  on_time
3  349781417030375472-2407080017-2  18  Aachen Hbf  8000404  5  Aachen West  Nordrhein-Westfalen  Aachen  52072  6.070715  50.780360  2024-07-08 00:20:00  2024-07-08 00:21:00  NaN  NaN  0  0  NaN  on_time  on_time
4  1983158592123451570-2407080010-3  33  Herzogenrath|Kohlscheid  8000404  5  Aachen West  Nordrhein-Westfalen  Aachen  52072  6.070715  50.780360  2024-07-08 00:20:00  2024-07-08 00:21:00  2024-07-08 00:20:00  2024-07-08 00:21:00  0  0  NaN  on_time  on_time
5  -5293934437045765939-2407080023-2  4  Herzogenrath  8000404  5  Aachen West  Nordrhein-Westfalen  Aachen  52072  6.070715  50.780360  2024-07-08 00:30:00  2024-07-08 00:31:00  2024-07-08 00:30:00  2024-07-08 00:31:00  0  0  Bauarbeiten. (Quelle: zuginfo.nrw)  on_time  on_time
6  6845762881043426854-2407072357-6  RB33  Lindern|Geilenkirchen|Übach-Palenberg|Herzogenrath|Kohlscheid  8000404  5  Aachen West  Nordrhein-Westfalen  Aachen  52072  6.070715  50.780360  2024-07-08 00:58:00  2024-07-08 00:58:00  NaN  NaN  0  0  NaN  on_time  on_time
7  -2100556839975301087-2407072307-13  18  Liège-Guillemins|Bressoux|Vise|Eijsden|Maastricht Randwyck|Maastricht|Meerssen|Valkenburg(NL)|Heerlen|Landgraaf|Eygelshoven Markt|Herzogenrath  8000404  5  Aachen West  Nordrhein-Westfalen  Aachen  52072  6.070715  50.780360  2024-07-08 00:37:00  2024-07-08 00:41:00  2024-07-08 00:37:00  2024-07-08 00:41:00  0  0  NaN  on_time  on_time
8  -7696913984968518161-2407080037-1  13  NaN  8000002  3  Aalen Hbf  Baden-Württemberg  Aalen  73430  10.096271  48.841013  NaN  2024-07-08 00:37:00  NaN  2024-07-08 00:37:00  0  0  Information  on_time  on_time
9  -6027587483204218492-2407080013-4  8  Bremen Hbf|Bremen-Sebaldsbrück|Bremen-Mahndorf  8000413  4  Achim  Niedersachsen  Achim  28832  9.030447  53.015990  2024-07-08 00:27:00  2024-07-08 00:27:00  2024-07-08 01:16:00  2024-07-08 01:17:00  49  50  NaN  delay  delay
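
In these sample rows the delay columns are consistent with the difference, in whole minutes, between the changed and the planned timestamp (for example 00:27 planned and 01:16 actual gives 49). The sketch below illustrates that reading. It is an assumption drawn from the rows shown, not a documented rule of the dataset, and `delay_minutes` is a hypothetical helper. Rows with a missing changed time show a delay of 0, which the sketch mirrors.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def delay_minutes(planned, changed):
    """Minutes between planned and changed time; 0 if either one is missing."""
    if planned is None or changed is None:
        return 0
    delta = (datetime.strptime(changed, TIMESTAMP_FORMAT)
             - datetime.strptime(planned, TIMESTAMP_FORMAT))
    return int(delta.total_seconds() // 60)


# Timestamps copied from sample row 9 (station Achim).
arrival_delay = delay_minutes("2024-07-08 00:27:00", "2024-07-08 01:16:00")
departure_delay = delay_minutes("2024-07-08 00:27:00", "2024-07-08 01:17:00")
print(arrival_delay, departure_delay)
```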
index  ID  line  path  eva_nr  category  station  state  city  zip  long  lat  arrival_plan  departure_plan  arrival_change  departure_change  arrival_delay_m  departure_delay_m  info  arrival_delay_check  departure_delay_check
2061347  2316002367592887267-2407142343-3  16  Ingolstadt Hbf|Ingolstadt Nord  8003074  5  Ingolstadt Audi  Bayern  Ingolstadt  85055  11.407456  48.790496  2024-07-14 23:50:00  2024-07-14 23:50:00  NaN  NaN  0  0  NaN  on_time  on_time
2061348  -5262352002503319170-2407142138-19  16  Nürnberg Hbf|Schwabach|Roth|Unterheckenhofen|Georgensgmünd|Mühlstetten|Pleinfeld|Ellingen(Bay)|Weißenburg(Bay)|Treuchtlingen|Pappenheim|Solnhofen|Dollnstein|Eichstätt Bahnhof|Adelschlag|Tauberfeld|Eitensheim|Gaimersheim  8003074  5  Ingolstadt Audi  Bayern  Ingolstadt  85055  11.407456  48.790496  2024-07-14 23:16:00  2024-07-14 23:17:00  2024-07-14 23:18:00  2024-07-14 23:19:00  2  2  NaN  on_time  on_time
2061349  1884127837918246080-2407142324-2  200  Wendlingen(Neckar)  8003983  5  Merklingen - Schwäbische Alb  Baden-Württemberg  Merklingen  89188  9.740877  48.521160  2024-07-14 23:38:00  2024-07-14 23:39:00  2024-07-14 23:38:00  2024-07-14 23:39:00  0  0  Bauarbeiten  on_time  on_time
2061350  -4498532330426324655-2407142201-14  RE18  Osnabrück Hbf|Osnabrück Altstadt|Bramsche|Bersenbrück|Quakenbrück|Essen(Oldb)|Cloppenburg|Ahlhorn|Großenkneten|Huntlosen|Sandkrug|Oldenburg(Oldb)Hbf|Rastede  8003105  5  Jaderberg  Niedersachsen  Jaderberg  26349  8.184538  53.344878  2024-07-14 23:55:00  2024-07-14 23:56:00  2024-07-14 23:55:00  2024-07-14 23:56:00  0  0  Bauarbeiten  on_time  on_time
2061351  -5558360799253050120-2407142310-4  RE18  Wilhelmshaven|Sande|Varel(Oldb)  8003105  5  Jaderberg  Niedersachsen  Jaderberg  26349  8.184538  53.344878  2024-07-14 23:33:00  2024-07-14 23:33:00  2024-07-14 23:33:00  2024-07-14 23:33:00  0  0  Bauarbeiten  on_time  on_time
2061352  -3877986638624297828-2407142237-4  S9  Bottrop Hbf|Bottrop-Boy|Gladbeck West  8002795  5  Herten (Westf)  Nordrhein-Westfalen  Herten  45699  7.139053  51.597508  2024-07-14 23:17:00  2024-07-14 23:17:00  NaN  NaN  0  0  NaN  on_time  on_time
2061353  3370285438001482281-2407142234-7  8  Lübeck-Travemünde Strand|Lübeck-Travemünde Hafen|Lübeck-Travem. Skandinavienkai|Lübeck-Kücknitz|Lübeck-Dänischburg IKEA|Lübeck Hbf  8003775  5  Lübeck-Moisling  Schleswig-Holstein  Lübeck  23560  10.629500  53.836800  2024-07-14 23:10:00  2024-07-14 23:11:00  2024-07-14 23:11:00  2024-07-14 23:12:00  1  1  Information  on_time  on_time
2061354  -8774053210575864323-2407142305-3  80  Bad Oldesloe|Reinfeld(Holst)  8003775  5  Lübeck-Moisling  Schleswig-Holstein  Lübeck  23560  10.629500  53.836800  2024-07-14 23:17:00  2024-07-14 23:18:00  2024-07-14 23:17:00  2024-07-14 23:18:00  0  0  Information  on_time  on_time
2061355  -1537118689903044118-2407142354-1  11  NaN  8001580  4  Düsseldorf Flughafen Terminal  Nordrhein-Westfalen  Düsseldorf  40474  6.766979  51.278517  NaN  2024-07-14 23:54:00  NaN  NaN  0  0  Information. (Quelle: zuginfo.nrw)  on_time  on_time
2061356  2862161729195150146-2407142324-1  11  NaN  8001580  4  Düsseldorf Flughafen Terminal  Nordrhein-Westfalen  Düsseldorf  40474  6.766979  51.278517  NaN  2024-07-14 23:24:00  NaN  2024-07-14 23:24:00  0  0  Information. (Quelle: zuginfo.nrw)  on_time  on_time

Duplicate rows

Most frequently occurring

index  ID  line  path  eva_nr  category  station  state  city  zip  long  lat  arrival_plan  departure_plan  arrival_change  departure_change  arrival_delay_m  departure_delay_m  info  arrival_delay_check  departure_delay_check  # duplicates
0  -1003145420136048192-2407100551-6  51  Gera Hbf|Hermsdorf-Klosterlausnitz|Stadtroda|Jena-Göschwitz|Jena West  8010366  2  Weimar  Thüringen  Weimar  99423  11.326458  50.991487  2024-07-10 06:49:00  2024-07-10 07:04:00  2024-07-10 06:51:00  2024-07-10 07:06:00  2  2  NaN  on_time  on_time  2
1  -1008819848758697010-2407130714-22  6  Grafing Bahnhof|Kirchseeon|Eglharting|Zorneding|Baldham|Vaterstetten|Haar|Gronsdorf|München-Trudering|München-Berg am Laim|München Leuchtenbergring|München Ost|München Rosenheimer Platz|München Isartor|München Marienplatz|München Karlsplatz|München Hbf (tief)|München Hackerbrücke|München Donnersbergerbrücke|München Hirschgarten|München-Laim  8004158  2  München-Pasing  Bayern  München  81241  11.461872  48.149852  2024-07-13 07:59:00  2024-07-13 08:01:00  2024-07-13 08:02:00  2024-07-13 08:03:00  3  2  NaN  on_time  on_time  2
2  -1009540259073221553-2407142134-10  6  Starnberg|Starnberg Nord|Gauting|Stockdorf|Planegg|Gräfelfing|Lochham|München-Westkreuz|München-Pasing  8004151  3  München-Laim Pbf  Bayern  München  80639  11.503669  48.144371  2024-07-14 21:59:00  2024-07-14 22:00:00  2024-07-14 22:01:00  2024-07-14 22:02:00  2  2  Bauarbeiten  on_time  on_time  2
3  -1010076636343338093-2407101633-11  6  Köln-Worringen|Köln-Blumenberg|Köln-Chorweiler Nord|Köln-Chorweiler|Köln Volkhovener Weg|Köln-Longerich|Köln Geldernstr./Parkgürtel|Köln-Nippes|Köln Hansaring|Köln Hbf  8003368  1  Köln Messe/Deutz  Nordrhein-Westfalen  Köln  50679  6.975001  50.940874  2024-07-10 16:59:00  2024-07-10 17:00:00  2024-07-10 16:59:00  2024-07-10 17:00:00  0  0  Information. (Quelle: zuginfo.nrw)  on_time  on_time  2
4  -1012813851155274121-2407111424-7  18  Maastricht|Meerssen|Valkenburg(NL)|Heerlen|Landgraaf|Eygelshoven Markt  8002806  3  Herzogenrath  Nordrhein-Westfalen  Herzogenrath  52134  6.094486  50.870916  2024-07-11 14:59:00  2024-07-11 15:00:00  NaN  NaN  0  0  NaN  on_time  on_time  2
5  -1012813851155274121-2407141424-7  18  Maastricht|Meerssen|Valkenburg(NL)|Heerlen|Landgraaf|Eygelshoven Markt  8002806  3  Herzogenrath  Nordrhein-Westfalen  Herzogenrath  52134  6.094486  50.870916  2024-07-14 14:59:00  2024-07-14 15:00:00  NaN  NaN  0  0  NaN  on_time  on_time  2
6  -1014485518442214187-2407080436-7  46  Ilmenau|Ilmenau Pörlitzer Höhe|Ilmenau-Roda|Elgersburg|Geraberg|Martinroda  8010274  5  Plaue (Thür)  Thüringen  Plaue  99338  10.908698  50.778393  2024-07-08 04:57:00  2024-07-08 05:06:00  2024-07-08 04:57:00  2024-07-08 05:06:00  0  0  NaN  on_time  on_time  2
7  -1014485518442214187-2407090436-7  46  Ilmenau|Ilmenau Pörlitzer Höhe|Ilmenau-Roda|Elgersburg|Geraberg|Martinroda  8010274  5  Plaue (Thür)  Thüringen  Plaue  99338  10.908698  50.778393  2024-07-09 04:57:00  2024-07-09 05:06:00  2024-07-09 04:57:00  2024-07-09 05:06:00  0  0  NaN  on_time  on_time  2
8  -1014485518442214187-2407100436-7  46  Ilmenau|Ilmenau Pörlitzer Höhe|Ilmenau-Roda|Elgersburg|Geraberg|Martinroda  8010274  5  Plaue (Thür)  Thüringen  Plaue  99338  10.908698  50.778393  2024-07-10 04:57:00  2024-07-10 05:06:00  2024-07-10 04:57:00  2024-07-10 05:06:00  0  0  NaN  on_time  on_time  2
9  -1014485518442214187-2407120436-7  46  Ilmenau|Ilmenau Pörlitzer Höhe|Ilmenau-Roda|Elgersburg|Geraberg|Martinroda  8010274  5  Plaue (Thür)  Thüringen  Plaue  99338  10.908698  50.778393  2024-07-12 04:57:00  2024-07-12 05:06:00  2024-07-12 04:57:00  2024-07-12 05:06:00  0  0  NaN  on_time  on_time  2
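
A "# duplicates" column like the one above can be produced by counting how often each complete record occurs and keeping the records seen more than once. The sketch below does this with the standard library on a few shortened, illustrative records (only ID, station and planned arrival are kept; the real rows have 20 columns). The helper name `duplicate_groups` is hypothetical, and this is not the profiling tool's own implementation.

```python
from collections import Counter


def duplicate_groups(rows):
    """Return {row: occurrences} for every row that appears more than once."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


# Shortened records for illustration: (ID, station, arrival_plan).
rows = [
    ("-1003145420136048192-2407100551-6", "Weimar", "2024-07-10 06:49:00"),
    ("-1003145420136048192-2407100551-6", "Weimar", "2024-07-10 06:49:00"),
    ("-1014485518442214187-2407080436-7", "Plaue (Thür)", "2024-07-08 04:57:00"),
    ("-1014485518442214187-2407080436-7", "Plaue (Thür)", "2024-07-08 04:57:00"),
    ("-1014485518442214187-2407090436-7", "Plaue (Thür)", "2024-07-09 04:57:00"),
]

groups = duplicate_groups(rows)
for row, n in groups.items():
    print(row[0], n)
```

Records are compared as whole tuples, so two stops of the same train on different days (different ID suffix and date) are not treated as duplicates of each other.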